ABSTRACT

We present a new proxy for the overdensity peak height for which the large-scale 
clustering of haloes of a given mass does not vary significantly with the assembly 
history. The peak height, usually taken to be well represented by the virial mass, 
can instead be approximated by the mass inside spheres of different radii, which in 
some cases can be larger than the virial radius and therefore include mass outside the 
individual host halo. The sphere radii are defined as $r = a\,\delta_t + b\,\log_{10}(M_{\rm vir}/M_{\rm nl})$,
where $\delta_t$ is the age relative to the typical age of galaxies hosted by haloes with virial
mass $M_{\rm vir}$, $M_{\rm nl}$ is the non-linear mass, and $a = 0.2$ and $b = -0.02$ are the free
parameters adjusted to trace the assembly bias effect. Note that $r$ depends on both
halo mass and age. In this new approach, some of the objects which were initially 
considered low-mass peaks (i.e. which had low virial masses) belong to regions with 
higher overdensities. At large scales, i.e. in the two-halo regime, this model properly 
recovers the simple prescription where the bias responds to the height of the mass peak 
alone, in contrast to the usual definition (virial mass) that shows a strong dependence 
on additional halo properties such as formation time. The dependence on the age in 
the one-halo term is also remarkably reduced with the new definition. The population 
of galaxies whose "peak height" changes with this new definition consists mainly of old
stellar populations and is preferentially hosted by low-mass haloes located near more
massive objects. The latter is in agreement with recent results which indicate that old, 
low-mass haloes would suffer truncation of mass accretion by nearby larger haloes or 
simply due to the high density of their surroundings, thus showing an assembly bias 
effect. The change in mass is small enough that the Sheth et al. (2001) mass function 
is still a good fit to the resulting distribution of new masses. 
1 INTRODUCTION



Many recent models of galaxy formation assume that galaxy 
properties are determined by the haloes in which they form 
and not by the surrounding larger-scale environment (e.g. 
Kauffmann et al. 1997; Berlind et al. 2003; Yang et al. 2003; 
Baugh et al. 2005). In this picture, the galaxy population 
in a halo of a given mass is independent of where the halo 
is located. This is justified by the standard description of 
structure formation, namely the extended Press-Schechter 
theory (EPS, Bond et al. 1991; Lacey & Cole 1993; Mo & 
White 1996), which was in turn based on both linear growth 
theory of density perturbations of an initial Gaussian random
fluctuation field (Press & Schechter 1974) and the non-linear
spherical collapse model. Furthermore, simulation results
as recent as Percival et al. (2003) indicated that the
halo clustering should only depend on the mass.$^1$

$^1$ Although these authors mention that a systematic difference
between the clustering of the set of all haloes of a given mass and
any of their subsamples could be hidden within the noise at a
level below 20 per cent in the bias.

However, a few years ago, it was shown that galaxy
properties such as the star formation rate (Gomez et
al. 2003; Balogh et al. 2004; Ceccarelli et al. 2008; and
Padilla, Lambas & Gonzalez 2010 in observations; Gonzalez
& Padilla 2009 in numerical simulations) and colours
(Gonzalez & Padilla 2009) depend on the large-scale structure.
Gomez et al. (2003) found that, for a sample of galaxies
in groups and clusters from the Sloan Digital Sky Survey
(SDSS), the star formation rate decreases, compared
with the field population, starting at ~ 4 virial radii toward
the cluster centre. Gonzalez & Padilla (2009) used
a semi-analytic model of galaxy formation and found that
the fraction of red galaxies diminishes for galaxies farther
away from clusters (or closer to voids) in environments with
the same local density. These results support the view that
galaxy populations also depend on the larger-scale environment,
both in models and observations.

Regarding the expectation that haloes of the same mass should
essentially exhibit the same properties, Gao et al. (2005,
hereafter G05) found that the large-scale clustering of
haloes of a given mass depends strongly on the formation
time, for halo masses $M \lesssim 6 \times 10^{12}\,h^{-1}\,M_\odot$. This study,
based on N-body simulations, showed that haloes assembled
at high redshift are more strongly correlated than those of
the same mass that assembled recently. This effect, which
is not expected from the excursion set theory, was termed
"assembly bias": the large-scale clustering of haloes of a
given mass varies significantly with their assembly history
(Gao & White 2007).

On the observational side of the assembly bias, Wang et 
al. (2008) found that groups selected from the SDSS with red 
central galaxies are more strongly clustered than groups of 
the same mass but with blue centrals, this effect being much
more important for less massive groups. In addition to the
clustering amplitude, Zapata et al. (2009) found that galaxy 
groups of similar mass and different assembly histories show 
differences in their galaxy population, for example in the 
fraction of red galaxies. Furthermore, Cooper et al. (2010) 
studied the relationship between the local environment and 
properties of galaxies in the red sequence. After removing 
the dependence of the average overdensity on colour and 
stellar mass, they still found a strong dependence on the 
luminosity-weighted stellar age. Galaxies with older stellar 
populations occupy regions of higher overdensities compared 
to younger galaxies of similar colours or stellar masses. The 
latter results show that the concept of assembly bias could 
be applicable to galaxies in addition to dark matter haloes, 
and would then affect the physics of galaxy formation. 

Other halo properties such as concentration, number
of subhaloes, subhalo mass function, shape, halo
spin, major merger rate, triaxiality, shape of the velocity
ellipsoid, and velocity anisotropy at a given mass
show an assembly-type bias effect in cosmological N-body
simulations (Wechsler et al. 2006; Zhu et al. 2006;
Croton et al. 2007; Bett et al. 2007; Gao & White 2007;
Hester & Tasitsiomi 2010; Faltenbacher & White 2010).



The reasons for this assembly bias are not yet fully un- 
derstood. EPS assumes no such environmental dependence. 
At a fixed mass, the Markovian nature of the random walk 
trajectories of perturbations smoothed at higher resolution, 
which characterise a halo, is assumed to be independent of 
the environment encoded in random walks at lower resolution.
Thus, halo properties should not be related to the
external environment in haloes of equal mass. These random
walks are obtained using a top-hat Fourier-space window
function to smooth (or to average) the density fluctuations;
this filter in k-space allows one to obtain an analytic expression
for the halo mass function that is equal to the Press-Schechter
formula. There have been attempts to modify this window
function to consider an environmental dependence. Zentner
(2007) combined a Gaussian window function and a variable
height of the barrier for collapsed objects, but found an
opposite trend for the assembly bias at low masses.

Furthermore, correlations between halo parameters do 
not simply show the same clustering behaviour. Bett et al. 
(2007) found that both the most nearly spherical haloes and 
those with highest spins are more strongly clustered than 
the average. However, this fact contradicts the correlation 
between spin and shape, where more spherical haloes have 
on average a slightly lower spin parameter. Also, for exam- 
ple, the work by Croton et al. (2007) showed that there are 
aspects of the assembly history which are not encoded in 
halo concentration or formation redshift and which correlate
with the large-scale environment. One possible explanation
was suggested by Dalal et al. (2008). They claim
that the halo assembly bias is related to the peak curvature
of Gaussian random fields for high-mass haloes, whereas in
the low-mass regime the bias arises from a subpopulation
of low-mass haloes whose mass accretion has ceased. These
haloes could have been ejected out of nearby massive haloes
(Ludlow et al. 2009). Wang et al. (2009) found that these
ejected low-mass subhaloes have earlier assembly times and
a much higher bias parameter than normal (not ejected)
haloes of the same mass, so that they contribute to the assembly
bias. However, they also found that the assembly
bias is not dominated by this population, indicating that
effects of the large-scale environment on "normal" haloes are
the main source of this bias.

Despite the fact that halo mass continues to be the most 
important parameter to determine the galaxy properties, it 
is relevant to study the assembly bias to gain further insights 
on the development of the Large-Scale Structure. This is par- 
ticularly significant when galaxies are used to constrain cos- 
mological parameters, as shown by Wu et al. (2008) in their 
study of the effects of halo assembly bias on galaxy cluster 
surveys. They used the halo concentration to find that up- 
coming photometric projects such as the Dark Energy Sur- 
vey (DES) and the Large Synoptic Survey Telescope (LSST) 
can infer significantly biased cosmological parameters from 
the observed clustering amplitude of galaxy clusters if the 
assembly bias is not taken into account. 

Hester & Tasitsiomi (2010, hereafter HT10) found an 
assembly-type bias in that the rate of major mergers of 
haloes of a given mass changes with the local environment. 
They proposed a dynamical explanation for this effect, par- 
ticularly for high densities, based on both tidal stripping, 
responsible for the decrease in the major merger rate of 
galaxy-like haloes, and interactions between bound haloes 
in the outskirts of groups, which are related to the increase
in the merger rate in group-like haloes. This plausible
explanation applies on scales out to ~ 250 kpc.

We suggest that, if the initial peak did not collapse
completely onto a halo, the halo mass will not be an appropriate
proxy for the peak height. Such haloes will appear old,
which would not be the case if the peak had finished its collapse
onto the halo (it would look dynamically younger). Therefore,
the scale out to which we need to extend the inclusion
of mass for the peaks could be as large as or even larger
than the scale proposed by HT10 since, at low z, the initial
overdensity may still be spread over larger areas around the
current collapsed halo. Wang et al. (2007) mention a similar
idea in that old, low-mass haloes were part of higher
peaks in the initial density field than what is revealed by
their present-day virial mass. By means of a semi-analytic
model, we will show that, at the present time, the assembly
bias may well be related to the infall region of a halo for
scales $80\,{\rm kpc} < r/h^{-1} < 1.5\,{\rm Mpc}$, a range where the one-halo
clustering amplitude between populations of the same
mass but different ages differs strongly (see Section 3.1).

The aim of this work is to understand the origin of the 
assembly bias and its role in the development of the large- 
scale structure and on the galaxy population, beyond the 
halo mass dependence. In order to reach this goal, we will 
study this effect on the semi-analytic galaxies of the Lagos, 
Cora, & Padilla (2008) model. This will allow us to compare 
our results with those obtained from observational data in 
future work, so as to provide another test for the ΛCDM
model of the Universe. Also, we will show that it is funda- 
mental to include the global effect from large scales on the 
peak height estimate. The results obtained with this proxy 
of the peak height will be compared with those obtained 
from the virial mass of host haloes by means of the spatial 
two-point correlation function and infall velocity profiles for 
galaxy samples of different relative ages. This new defini- 
tion of an overdensity peak height will not be subject to the 
assembly bias seen at large separations, thus objects of the 
same mass but different ages will show essentially the same 
clustering in the two-halo regime. Given that the assembly 
bias has also been detected when separating samples according to
several parameters other than the halo age, in subsequent
papers we will also investigate its prevalence when studying
galaxies and haloes of different concentrations, number of
satellites, and sphericity, and whether our proposed explanation
for the assembly bias also holds when using observational
data.

The outline of this paper is as follows. In Section 2 we
introduce our simulation. We then present the statistics of
density fields for the simulation in Section 3 to compare our
results with those from other authors that show the assembly
bias effect. The redefinition of the overdensity peak height
by using the two-point correlation function and the infall
velocity profile is developed in Section 4. The nature of the
objects that are being considered with this redefinition is
shown in Section 5. Finally, we discuss our results in Section
6. The cosmology used here is $\Omega_{\rm tot} = 1$, $\Omega_m = 0.28$, $\Omega_\Lambda =
0.72$, $\sigma_8 = 0.9$, $h = 0.72$, unless otherwise indicated.



2 DATA

We use the SAG2 model by Lagos, Cora, & Padilla (2008;
see also Lagos, Padilla, & Cora 2009), which combines a
cosmological N-body simulation of the concordance ΛCDM
universe and a semi-analytic model of galaxy formation. The
numerical simulation consists of a periodic box of 60 $h^{-1}$
Mpc on a side that contains $256^3$ dark matter particles with
a mass resolution of ~ $10^9\,h^{-1}\,M_\odot$. The galaxy population
in the semi-analytic model is generated using the merger
histories of dark matter haloes. One of the main features
of this model is the implementation of the Active Galactic
Nuclei (AGN) feedback, which reduces star formation by
quenching the gas cooling process, an important effect on
massive haloes at low redshifts.

One of the most important parameters throughout this
work is the age. We will use it to study the assembly bias
effect for galaxies of a wide range of luminosities and, also,
for dark matter haloes of a wide range in mass.

For galaxies, we will use the mass-weighted stellar age
or, simply, stellar age defined as

$$t = t_0 - \frac{\sum_i t_i\,\dot{M}_{\rm star}(t_i)\,\Delta t_i}{\sum_i \dot{M}_{\rm star}(t_i)\,\Delta t_i},$$   (1)

where $t_0$ is the age of the Universe today, $t_i$ is the time corresponding
to the $i$th output of the simulation, and $\dot{M}_{\rm star}$
is the star formation rate calculated using the stellar mass
$\Delta M_{\rm star}$ accreted in a time step $\Delta t_i$. We use this parameter,
the stellar age, as it can be directly obtained from observational
data (Kauffmann et al. 2003; Gallazzi et al. 2005).

On the other hand, the formation redshift of a dark
matter (DM) halo is defined as the redshift when it assembled
50 per cent of its final mass at z = 0. It is important
to mention that the assembly bias has been detected by
using this definition of age for DM haloes, and thus it
will be used in this work. There are other definitions that
show a weaker or absent dependence of halo clustering
on the halo formation time, as was shown by Li et al. (2008).

2.1 Age parameter

To study the assembly bias, whereby old haloes
are more strongly clustered than young haloes of the same mass,
it is not convenient to work directly with the stellar or halo
formation age because they correlate with the mass. For
example, massive dark matter haloes have, on average, older
stellar populations (Figure 1). We need a definition of age
which is independent of the mass. This is very important
if we want to study galaxies in haloes of a wide range of
masses. For example, age maps could show old objects in
regions inhabited only by massive haloes. Motivated by this
problem, the first step is to find a proxy for a non-mass-dependent
age. One way to do this consists in using ages
relative to the median stellar age as a function of the host
DM halo mass. We define the dimensionless parameter $\delta_t$,

$$\delta_t = \frac{t_i - \langle t(M) \rangle}{\sigma_t(M)},$$   (2)



where, for the $i$th galaxy, $t_i$ is its stellar age, $\langle t(M) \rangle$ is
the median stellar age as a function of host halo mass (red
squares connected by the solid line in Fig. 1), with $M$ being
the virial mass, and $\sigma_t(M)$ the dispersion around the
median in units of time (error bars in Fig. 1). In the case
of DM haloes, $t_i$ is the formation redshift. This definition
implies that objects with positive (negative) values of $\delta_t$ lie
above (below) the median stellar age or formation time for
a population of a given mass. Then, positive values of $\delta_t$
correspond to older objects, whereas negative values of $\delta_t$
are related to younger objects. The histogram in Figure 2
shows the distribution of the $\delta_t$ parameter for galaxies in
different mass bins. The shape of the distribution of $\delta_t$ is
similar among them. Also, the median host halo mass for
$\delta_t < 0$ and $\delta_t > 0$ is similar, $\langle M \rangle \simeq 1.7 \times 10^{10}\,h^{-1}\,M_\odot$,
confirming that this parameter is independent of the DM
halo mass.
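The relative age of Equation (2) can be illustrated with a short sketch. It is not the authors' code: the binning in log mass, the use of half the 16th to 84th percentile range as the dispersion $\sigma_t(M)$, the function name `relative_age` and the mock data are all assumptions made for illustration.

```python
import numpy as np

def relative_age(age, log_mass, bin_edges):
    """Return delta_t = (t_i - <t(M)>) / sigma_t(M) for each object.

    <t(M)> is the median age in the object's mass bin. sigma_t(M) is taken
    here as half the 16th-84th percentile range (an assumption; the text
    only calls it the dispersion around the median).
    """
    age = np.asarray(age, dtype=float)
    log_mass = np.asarray(log_mass, dtype=float)
    delta_t = np.full(age.shape, np.nan)
    # np.digitize returns b for values with bin_edges[b-1] <= x < bin_edges[b]
    idx = np.digitize(log_mass, bin_edges)
    for b in range(1, len(bin_edges)):
        sel = idx == b
        if sel.sum() < 2:
            continue  # too few objects to define a median and a spread
        median = np.median(age[sel])
        p16, p84 = np.percentile(age[sel], [16, 84])
        sigma = 0.5 * (p84 - p16)
        if sigma > 0:
            delta_t[sel] = (age[sel] - median) / sigma
    return delta_t

# Hypothetical mock catalogue: age increases with mass, plus scatter.
rng = np.random.default_rng(0)
log_m = rng.uniform(10.0, 13.0, 20000)
t = 2.0 + 1.5 * (log_m - 10.0) + rng.normal(0.0, 1.0, log_m.size)
dt = relative_age(t, log_m, np.linspace(10.0, 13.0, 31))
```

In this mock catalogue the raw age correlates strongly with mass by construction, whereas the resulting `dt` is centred on zero in every mass bin, which is the property the age parameter is designed to have.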
Figure 1. Stellar age as a function of the virial mass (logarithms
are base 10 throughout) for the galaxies in the simulation. Due 
to the large number of available galaxies, only 5,000 of them, 
randomly chosen, are plotted as points. Red squares are the me- 
dians for each mass bin. Error bars correspond to the 10 and 90 
percentiles of the stellar age distribution. The median stellar pop- 
ulation of low-mass dark matter haloes is younger than that of 
the massive DM haloes. 



3 STATISTICS OF DENSITY FIELDS 

In this section we present the two-point correlation function 
which allows us to calculate the clustering of haloes and 
galaxies, measured directly from the simulation and from 
theoretical expressions for the power spectrum. 



3.1 The two-point correlation function 

The correlation function, $\xi(r)$, is a useful quantitative measure
of the spatial clustering. It gives the excess probability
for finding pairs of particles at a given separation relative
to a Poisson distribution. The probability of finding two points
separated by a distance $r$, in respective volume elements
$dV_1$ and $dV_2$, is given by

$$dP = n^2 [1 + \xi(r)]\,dV_1\,dV_2,$$   (3)

where $n$ is the average number density of points.

In practice the estimator used, particularly for numerical
simulations with periodic boundary conditions, is

$$DD(r) = RR(r)[1 + \xi(r)],$$

and then

$$\xi(r) = \frac{DD(r)}{RR(r)} - 1.$$   (4)



Here, $DD(r)$ represents the frequency of the data pairs,
whereas $RR(r)$ corresponds to the frequency of random
pairs, defined as

$$RR(r) = N_{\rm sel}\,N_{\rm tot}\,\frac{V(r)}{V_{\rm box}},$$   (5)

where $N_{\rm sel}$ is the number of selected objects in a given sample,
$N_{\rm tot}$ is the total number of objects, and $V(r)$ is the
volume in a shell at distance $r$ which is normalised by the
volume of the box $V_{\rm box}$ in the simulation. In the case of an
auto-correlation function, $N_{\rm tot} = N_{\rm sel}$.

Figure 2. Histograms of the $\delta_t$ parameter for the different mass
ranges indicated in the figure key. The three distributions are similar
and cover the full range of $\delta_t$. This parameter is independent
of the DM halo mass.

The cross-correlation function estimates the clustering 
amplitude between two different data sets. We will calculate 
this function for a selected sample against all the available 
objects in the simulation because it will have a higher signal 
than the correlation between the same selected elements, i.e. 
the autocorrelation function (e.g. Bornancini et al. 2006). 
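As an illustration of this estimator, the following sketch counts pairs between a selected sample and the full population in a periodic cubic box and applies $\xi = DD/RR - 1$ with the analytic random-pair count $N_{\rm sel} N_{\rm tot} V(r)/V_{\rm box}$. The function name `cross_correlation`, the brute-force pair counting and the uniform mock positions are assumptions for illustration, not the procedure actually used for the figures.

```python
import numpy as np

def cross_correlation(sample, full, r_edges, box_size):
    """xi(r) = DD(r) / RR(r) - 1 in a periodic cubic box.

    sample, full : (N_sel, 3) and (N_tot, 3) positions in [0, box_size)
    r_edges      : increasing bin edges in separation (same units as the
                   box); keep the first edge > 0 so that self-pairs are
                   excluded, and the last edge below box_size / 2
    """
    sample = np.asarray(sample, dtype=float)
    full = np.asarray(full, dtype=float)
    r_edges = np.asarray(r_edges, dtype=float)
    dd = np.zeros(len(r_edges) - 1)
    for p in sample:  # one selected object at a time keeps memory low
        d = np.abs(full - p)
        d = np.minimum(d, box_size - d)  # minimum-image convention
        r = np.sqrt((d ** 2).sum(axis=1))
        dd += np.histogram(r, bins=r_edges)[0]
    shell = 4.0 / 3.0 * np.pi * np.diff(r_edges ** 3)  # shell volume V(r)
    rr = len(sample) * len(full) * shell / box_size ** 3
    return dd / rr - 1.0

# Hypothetical unclustered mock: xi should scatter around zero.
rng = np.random.default_rng(1)
all_pos = rng.uniform(0.0, 60.0, size=(4000, 3))
xi = cross_correlation(all_pos[:800], all_pos, [5.0, 10.0, 15.0, 20.0], 60.0)
```

For a real catalogue a tree-based pair counter would be preferable to this direct loop; the loop is kept here only because it makes the correspondence with Equations (4) and (5) explicit.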



3.1.1 Cross-correlation function for haloes 

In order to test whether our simulation is able to reproduce 
the observed signal of assembly bias at large scales found by 
other authors, Figure [3] shows the spatial cross-correlation 
function for haloes of different formation times and a given 
mass against the full population of haloes in the simulation, 
at $z = 0$ (top panels). The two ranges of masses shown in
the figure are the same as those used by G05 in two panels of
their Fig. 2 which exhibit the assembly bias effect. The result
for the 20% oldest haloes (highest $\delta_t$) in each mass range is shown as
dot-dashed red lines, while that for the 20% youngest haloes 
is shown as dotted blue lines. Note that in the panels of their 
Figure 2, G05 show the autocorrelation function of haloes. 
Figure 3. Main panels (top): Two-point cross-correlation function for haloes from our simulation for two different ranges of mass (left
and right panels). The old population is represented as dot-dashed red lines, whereas the young one appears as dotted blue lines (the full
population of haloes is shown as solid black lines). Error bars were calculated using the jackknife method. Lower panels: ratio between
the bias of old and young objects in our simulation (solid lines) and in G05 (dashed lines). At large scales, both simulations show a higher 
clustering for the old haloes with respect to the young ones with a remarkable difference for the lowest mass bin (left-hand panel). Our 
simulation is able to measure the assembly bias effect with a high statistical significance. 



To compare their estimates of assembly bias with ours, we
consider the expression found in Mo & White (1996) for the
bias of a given halo sample, $b_H$, on large scales,

$$\xi_{HH}(r, M) = b_H^2(M)\,\xi_{mm}(r),$$   (6)

where $\xi_{HH}$ is the autocorrelation function for haloes and
$\xi_{mm}$ is that for the underlying matter, which assumes that
the halo density field is proportional to the matter density
field times the bias parameter. If the population of haloes
is separated into old and young subpopulations, the ratio
between the bias of these samples is, in the G05 case,



$$\frac{b_{H,{\rm old}}}{b_{H,{\rm young}}} = \sqrt{\frac{\xi_{HH,{\rm old}}}{\xi_{HH,{\rm young}}}}.$$   (7)

In our case, we calculate the cross-correlation function as

$$\xi_{HH'}(r, M) = b_H(M)\,b_{H'}(M)\,\xi_{mm}(r).$$   (8)

The subscript $H$ refers to the selected haloes in a mass bin,
whereas $H'$ represents all the haloes in our simulation. Then,

$$\frac{b_{H,{\rm old}}}{b_{H,{\rm young}}} = \frac{\xi_{HH',{\rm old}}}{\xi_{HH',{\rm young}}}.$$   (9)

These ratios are shown as dashed and solid lines in the lower
panels of Figure 3 for the G05 and our simulation, respectively.$^2$

$^2$ The cosmology in G05 was adjusted to that used in this paper.

As can be seen from these panels, both simulations show
a higher clustering for the old population than for the
young one, and it can also be seen that our simulation can
reproduce the assembly bias effect with an appropriate statistical
significance, particularly for low-mass haloes.

3.1.2 Galaxy cross-correlation functions

The top row of Figure 4 shows the spatial cross-correlation
function between galaxy samples of different relative ages
$\delta_t$ but equal host halo masses, and the full population of
galaxies in our simulation (~ 63,000 objects). Using $\delta_t$, we
find an assembly bias effect in our galaxies where the old
population (red filled triangles and open circles) shows a
higher clustering than the young population (blue open and
filled squares), this effect being much stronger in the low-mass
regime. The lower box in each panel shows the ratios
between the correlation function of the oldest population
(red triangles) and the total population, and between the
youngest population (filled blue squares) and the total one,
as dot-dashed red and dotted blue lines, respectively. The
error of the ratio between $\xi(r)$ for the oldest and youngest
objects is shown as a shaded region around the value that
would be obtained if both correlation functions were the
same (ratio equal to unity).

Figure 4. Correlation functions for the different mass bins indicated in each panel. Old and young galaxies are shown in red and blue
symbols, respectively. The figure key shows the ranges of $\delta_t$ corresponding to the different symbols. Error bars were calculated using the
jackknife method. The lines repeated in each top box are obtained from the non-linear and linear power spectra, $P(k)$ (labels indicated
in the top left panel; see details in Section 3.2). The lower box in each panel corresponds to the ratio between the correlation function of
the oldest population (red triangles) and the total population of the selected sample and, also, between the youngest population (filled
blue squares) and the total one (dot-dashed red and dotted blue lines, respectively). The error of the ratio between the $\xi(r)$ of the oldest
and youngest objects is shown as a shaded region around the unit ratio. Top row: the age is defined using the virial mass of the host
halo. Notice the strong difference of almost two orders of magnitude between the old and young populations at $r \sim 150\,h^{-1}$ kpc for the
lowest mass bin. Bottom row: galaxies are selected according to a tentative new mass measurement, $M_{\rm mix}$ (see Section 4.1).

As can be seen from the lowest-mass bin (top left
panel), the amplitude of clustering is higher for the old
population than the young one, particularly for scales
$80\,{\rm kpc} < r/h^{-1} < 1.5\,{\rm Mpc}$ (one-halo term). This could indicate
that their density profiles are different, with the
young population probably being less dynamically evolved
internally. The strong difference in clustering at distances beyond
$1\,h^{-1}$ Mpc may imply that if the mass in the vicinity
(surrounding areas or the infall region) of haloes were taken
into account, the clustering would show no dependence on age. In other
words, as the virial mass of haloes is not good enough as an
overdensity peak height estimator in the simple EPS picture,
this alternative could provide a better estimator for this
peak height. HT10 detected an assembly-type bias for the
dark matter halo major merger rate using the Millennium
Simulation (Springel et al. 2005), and proposed a physical
mechanism for this effect that, as was mentioned above, extends
out to ~ 250 kpc. Owing to the result seen in the top
row of Figure 3, it is possible that to explain the assembly
bias one would need to characterise the peak with mass on
scales larger than the virial radius (see Section 4), extending
the local definition of peak from within a halo to larger
scales usually regarded as part of the global environment.

Additionally, recent studies have suggested that a population
of subhaloes that were expelled from larger haloes
located beyond three times the virial radius of the main halo
could explain the age dependence of the clustering in the
low-mass regime (Dalal et al. 2008; Ludlow et al. 2009). Although
Wang et al. (2009) found that these low-mass haloes
are not the main source of the assembly bias, they claim
that environmental effects at large scales play a very important
role in this issue. In this case, the masses of the
expelled subhaloes are poor indicators of their peak height,
which could be better represented by the higher mass of a
larger halo.

Note that our results reproduce those found by previ- 
ous authors where the dependence of the clustering on the 
assembly history is only detected in low-mass haloes. This 
indicates that the peak may include matter around haloes 
to distances that depend on both halo mass and age. 



/. The nature of assembly bias 7 



3.2 Theoretical estimates of £(r) 

The statistical properties of the density fluctuation field can
be represented by its power spectrum $P(k)$, or equivalently
by its dimensionless power spectrum $\Delta^2(k)$,

$$\Delta^2(k) = \frac{k^3}{2\pi^2}\,P(k),$$   (10)

which measures the power per logarithmic bin in
wavenumber $k$. This spectrum is a direct manifestation of
the hierarchical growth of structures, where small-scale perturbations
collapse first to grow and later form larger-scale
perturbations which will collapse and form larger objects,
in a non-linear process as time progresses. Directly related
to this evolution are the abundance and clustering of galaxy
systems and their variations as a function of mass and redshift.
The Fourier transform of the power spectrum results
in the matter correlation function

$$\xi_{mm}(r) = \int_0^\infty \Delta^2(k)\,\frac{\sin(kr)}{kr}\,\frac{dk}{k}.$$   (11)
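The step from a power spectrum to $\xi_{mm}(r)$ can be sketched numerically as below. The sketch assumes the convention $\xi(r) = \int \Delta^2(k)\,[\sin(kr)/kr]\,d\ln k$ with $\Delta^2(k) = k^3 P(k)/2\pi^2$, and it uses a Gaussian toy spectrum (not the Smith et al. 2003 model) only because its transform is known in closed form. The function names and the grid in $k$ are illustrative choices.

```python
import numpy as np

def dimensionless_power(k, pk):
    """Delta^2(k) = k^3 P(k) / (2 pi^2)."""
    return k ** 3 * pk / (2.0 * np.pi ** 2)

def xi_from_pk(r, k, pk):
    """Integrate Delta^2(k) sin(kr)/(kr) over ln k with the trapezoid rule."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    lnk = np.log(k)
    delta2 = dimensionless_power(k, pk)
    # np.sinc(x) = sin(pi x) / (pi x), hence the division by pi
    integrand = delta2 * np.sinc(np.outer(r, k) / np.pi)
    mid = 0.5 * (integrand[:, 1:] + integrand[:, :-1])
    return np.sum(mid * np.diff(lnk), axis=1)

# Toy check: P(k) = A exp(-k^2 s^2) has the exact transform
# xi(r) = A / (8 pi^1.5 s^3) * exp(-r^2 / (4 s^2)).
k = np.logspace(-4, 2, 6000)
amp, s = 1.0, 1.0
pk = amp * np.exp(-(k * s) ** 2)
r = np.array([0.5, 1.0, 2.0])
xi_num = xi_from_pk(r, k, pk)
xi_exact = amp / (8.0 * np.pi ** 1.5 * s ** 3) * np.exp(-r ** 2 / (4.0 * s ** 2))
```

With a tabulated linear or non-linear $P(k)$ in place of the toy spectrum, the same function would return the corresponding $\xi_{mm}(r)$, provided the grid in $k$ resolves the oscillations of $\sin(kr)$ at the largest separations of interest.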

The Smith et al. (2003) fitting model provides a good
estimate for $\xi_{mm}(r)$ from the non-linear and the linear
power spectra (Boylan-Kolchin et al. 2009). In order to do
the comparison with the correlation function of galaxies in
the simulation, we use Equation (8) with this matter correlation
function. The bias parameter $b$ is estimated by using the fit
proposed by Seljak & Warren (2004),

$$b(x = M/M_{\rm nl}) = 0.53 + 0.39\,x^{0.45} + \frac{0.13}{40x + 1} + 5 \times 10^{-4}\,x^{1.5},$$   (12)

with an accuracy on the bias-halo mass relation at the level
of 3 per cent for $b < 1$. Here, $M_{\rm nl}$ refers to the non-linear
mass, defined as the mass within a sphere for which the
rms fluctuation amplitude of the linear field is 1.69,
the critical overdensity which corresponds to
the gravitational collapse in the spherical collapse model
(Gunn & Gott 1972). For our simulation, we find $M_{\rm nl} \simeq
2.4 \times 10^{13}\,h^{-1}\,M_\odot$.
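A direct transcription of the Seljak & Warren (2004) fit of Equation (12) is sketched below. The function name is hypothetical, and the example mass is the median host halo mass quoted in Section 2.1, used here only as an illustrative input since the exact average mass behind the bias value adopted for Figure 4 is not stated.

```python
def seljak_warren_bias(mass, m_nl):
    """Large-scale halo bias b(x) with x = M / M_nl, Equation (12)."""
    x = mass / m_nl
    return (0.53 + 0.39 * x ** 0.45
            + 0.13 / (40.0 * x + 1.0)
            + 5e-4 * x ** 1.5)

M_NL = 2.4e13  # non-linear mass quoted in the text, in h^-1 Msun

# Illustrative input: the median host halo mass from Section 2.1.
b_low_mass = seljak_warren_bias(1.7e10, M_NL)
# At M = M_nl (x = 1) the fit is slightly below unity.
b_nl = seljak_warren_bias(M_NL, M_NL)
```

For the illustrative low mass the fit gives a bias of about 0.67, below unity as expected for haloes well under the non-linear mass.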

The top boxes in each panel of Figure 4 show $\xi(r)$ as
obtained from the non-linear power spectrum (solid black
line) with $b = 1$; from the non-linear $P(k)$ (lower solid green
line) with $b = 0.6733$ (corresponding to the average host halo
mass); from the linear $P(k)$ with $b = 1$ (dotted black line);
and from the linear $P(k)$ (lower dotted magenta line) with
the bias factor for the average host halo mass. The biased
$\xi(r)$ obtained from the non-linear $P(k)$ (lower solid green
line) is expected to represent the correlation function without
the assembly bias for the average host halo mass. The
scale where the linear and non-linear power spectra start
to diverge is around $1.5\,h^{-1}$ Mpc, which is also the scale out to
which the correlation function shows the strongest difference
in clustering between the old and young populations (top
left panel of Figure 4). Therefore, we will start studying
scales of this tentative size for estimating the height of the
mass peak to see how it affects the galaxy clustering (Section
4.1). Later in this paper we will carry out a $\chi^2$ search
for this scale and its dependence on halo properties, since
the scale may introduce a large change in the resulting peak
mass function.



4 REDEFINITION OF AN OVERDENSITY 
PEAK HEIGHT 

We propose to extend the proxy for peak height to larger 
scales so that it does not show the assembly bias effect. The 
scale will include the mass of the peak which has already 
collapsed but also, in some cases, some of the mass that, 
due to global environmental effects, has not done so yet. 
This will be equivalent to a new definition of "halo." For 
each galaxy we will consider all the dark matter particles 
within a scale that will depend on the host halo mass and 
its age (see Section 4.3). The mass contained in this halo,
together with the stellar age of each galaxy, will be used to 
study the large-scale bias. 
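The abstract gives the final form of this scale as $r = a\,\delta_t + b\,\log_{10}(M_{\rm vir}/M_{\rm nl})$ with $a = 0.2$ and $b = -0.02$. A minimal sketch of that relation follows; the function name is hypothetical, and the units of $r$ are not stated in this part of the text, so the returned value should be read in whatever length unit the fitted parameters assume.

```python
import math

def peak_radius(delta_t, m_vir, m_nl, a=0.2, b=-0.02):
    """Sphere radius r = a * delta_t + b * log10(M_vir / M_nl).

    The defaults for a and b are the values quoted in the abstract; the
    length unit of r is an open assumption of this sketch.
    """
    return a * delta_t + b * math.log10(m_vir / m_nl)

# Example: an object one dispersion older than the median (delta_t = 1)
# in a halo a hundred times less massive than the non-linear mass.
r_old = peak_radius(1.0, 2.4e11, 2.4e13)
# Same halo mass, but at the median age (delta_t = 0).
r_median = peak_radius(0.0, 2.4e11, 2.4e13)
```

With these signs of $a$ and $b$, the radius grows with relative age and shrinks with increasing virial mass, so the largest spheres are assigned to old objects in low-mass haloes.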

Throughout this section, two different approaches that 
characterise the assembly bias, the two-point correlation 
function and the infall velocity profile, will be presented and 
later used to define the overdensity peak height. They will 
allow one to parametrise the scale which will trace the as- 
sembly bias at large scales. 

4.1 Using $\xi(r)$ to determine the presence of an
assembly-type bias 

In a first attempt, we approximate the peak height for each 
galaxy considering all the DM particles inside a radius of 1.5 
and 1.7 Mpc for old and young objects, respectively, 
motivated by the results of the previous section. We then 
repeat the same procedure described in Section 2.1. In this
case, Equation (2) is applied using this new mass definition,
$M_{\rm mix}$. The results are shown in the bottom row of Figure
4. The left column shows the lower mass bin, whereas the
most massive bin is shown in the right column. This peak 
height definition cannot fully trace the assembly bias at large 
scales for each mass bin, although it does a better job than 
the virial mass. Therefore, a redefinition of halo mass in- 
cluding larger scales than the virial radius could recover the 
simple prescription where the bias responds to the height 
of the mass peak alone. To achieve this goal, it thus seems 
necessary to consider influences beyond the virial radius, 
probably reaching the infall region of haloes. 
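As an illustration of this first attempt, the enclosed mass can be sketched as a direct count of particles within a fixed radius. This is only a minimal sketch: it assumes equal-mass particles, ignores periodic boundaries, and assumes that old and young objects are separated by the sign of the relative age. The function names `first_attempt_radius` and `enclosed_mass` are hypothetical and not part of the original analysis.

```python
import math

def first_attempt_radius(delta_t):
    """Fixed trial radius: 1.5 for old objects, 1.7 for young ones.

    Assumption: old objects have delta_t > 0 and young ones delta_t <= 0.
    """
    return 1.5 if delta_t > 0 else 1.7

def enclosed_mass(centre, radius, particles, particle_mass):
    """Total mass of equal-mass particles within `radius` of `centre`.

    Brute-force loop over all particles; no periodic wrapping is applied.
    """
    count = sum(1 for p in particles if math.dist(centre, p) <= radius)
    return count * particle_mass
```

A production measurement on a large simulation would need a spatial index (e.g. a tree or grid) instead of the brute-force loop used here.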

It is important to point out that the redefinition of mass
does not affect positions and hence only changes the relative
age $\delta_t$ of a galaxy.

Since this tentative approximation may be overcor- 
rected as the scale may depend on different parameters (e.g. 
mass, age), the next section will present an estimator for the 
size of the region to use for this new definition of peak height 
based also on the velocity profile of the infall region. Then, 
we will combine the constraints obtained from correlation 
functions and infall velocities to estimate a parametrised 
proxy for the peak height. 

4.2 Using the Infall Velocity to determine the 
presence of an assembly-type bias 

The infall velocity profile $v_{\rm inf}$ around galaxies is another
statistic which is sensitive to the assembly bias. The $v_{\rm inf}$
values should depend on the initial density fluctuations as
well as the clustering because, at large scales, the behaviour
of haloes (and galaxies) is dominated by the collapse of these
perturbations.

Figure 5. Radial velocity profiles around galaxies. The age parameter $\delta_t$ is given by the virial mass. The figure keys show the ranges of
$\delta_t$ corresponding to the different symbols. The black solid line corresponds to the case where the galaxies are not split according to
their ages. Left: All haloes. In this case, the infall velocity for young galaxies (lower dotted and solid blue lines) does not have as clear
a peak as that for old objects (upper solid and dashed red lines). Right: A subsample of high-mass objects, $M_{\rm vir} \gtrsim 10^{13}\,h^{-1}\,M_\odot$.
The old and young galaxies (red and blue, respectively) have similar profiles, as both populations show a peak at $\sim 1.3\,h^{-1}$ Mpc. This
correlates with the smaller difference in clustering between old and young galaxies hosted by high-mass haloes.

We calculate the radial velocity profile around galaxies, 
which can be expressed as 

$$ v_r(r) = v_{n_i}(r) - v_{c_i}, \tag{13} $$

where $v_{c_i}$ is the projected velocity of the central galaxy of
the new halo along the direction between this galaxy and
its $i$th neighbouring galaxy located at a distance $r$, whose
projected velocity along this direction is $v_{n_i}(r)$. The infall
velocity at a distance $r$ around galaxies is the average value
of $v_r(r)$.
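The radial velocity just defined can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the helper names `radial_velocity` and `mean_radial_profile` are hypothetical, positions are assumed to be Cartesian without periodic wrapping, and the sign convention (negative values mean the neighbour approaches the central galaxy) is an assumption.

```python
import math

def radial_velocity(pos_c, vel_c, pos_n, vel_n):
    """Return (distance, v_n - v_c), with both velocities projected on the
    direction from the central galaxy to the neighbour."""
    sep = [pn - pc for pn, pc in zip(pos_n, pos_c)]
    dist = math.sqrt(sum(s * s for s in sep))
    unit = [s / dist for s in sep]
    v_n = sum(v * u for v, u in zip(vel_n, unit))
    v_c = sum(v * u for v, u in zip(vel_c, unit))
    return dist, v_n - v_c

def mean_radial_profile(pairs, r_edges):
    """Average v_r in radial bins.

    `pairs` is an iterable of (pos_c, vel_c, pos_n, vel_n) tuples and
    `r_edges` are the bin edges; empty bins yield NaN.
    """
    sums = [0.0] * (len(r_edges) - 1)
    counts = [0] * (len(r_edges) - 1)
    for pos_c, vel_c, pos_n, vel_n in pairs:
        dist, v_r = radial_velocity(pos_c, vel_c, pos_n, vel_n)
        for k in range(len(sums)):
            if r_edges[k] <= dist < r_edges[k + 1]:
                sums[k] += v_r
                counts[k] += 1
                break
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]
```

Running `mean_radial_profile` separately on the old and young subsamples gives the two profiles that are compared in this section.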

The $v_{\rm inf}$ profiles for young and old galaxies according
to their virial masses are quite different (left-hand panel
of Fig. 5). While the old galaxies (upper solid and dashed
red lines) have a peak in their infall velocity distribution at
$1.5\,h^{-1}$ Mpc, the young population does not have a clear
maximum (lower dotted and solid blue lines). The top of
this distribution is rather flat around $1.3\,h^{-1}$ Mpc. However,
galaxies located in high-mass haloes, such as those plotted in the
top right panel of Fig. 4, should have similar velocity profiles,
since they do not have a strong assembly bias. Their velocity
profiles are shown in the right-hand panel of Figure 5. Both
populations show similar infall velocity behaviours and a
peak of the distribution at $\sim 1.3\,h^{-1}$ Mpc.

The aim of the next section is to find the best values
of the radius enclosing the mass of the density peak as a
function of both age and mass, in order to obtain similar
velocity profiles and correlation functions for galaxies of very
different ages but equal masses.

4.3 Parametrising a new overdensity peak height 
proxy 

The previous sections have shown that a new proxy for the 
peak height could better characterise on average the assem- 
bly bias effect seen at large scales than the proxy given by 
the virial mass. Apart from the difference in clustering be- 
tween populations of different ages but equal mass, the ve- 
locity profile could also be used to detect this bias. 

We parametrise the radius of each galaxy as a function
of both virial mass and $\delta_t$. We then measure the masses
inside spheres defined by this radius and calculate their relative
ages with respect to this mass. Finally, a $\chi^2$ statistic
of the differences between the young and old populations in
velocity profiles, $\chi^2_v(r)$, and correlation functions,
$\chi^2_\xi(r)$, will be used to find the parameter set that traces
the assembly bias most accurately.

The radius for each galaxy is parametrised as

$$ r = a\,\delta_t + b\,\log\left(\frac{M_{\rm vir}}{M_{\rm nl}}\right), \tag{14} $$

where $M_{\rm nl}$ is the non-linear mass defined by Seljak & Warren
(2004; see the earlier section where it is introduced), with
$\log(M_{\rm nl}/h^{-1}\,M_\odot) = 13.38$ for our
choice of cosmological parameters. The free parameters are
$a$ and $b$. The new peak height proxy will be the mass $M$
enclosed within this radius. It is assumed that if $r$ is smaller
than the virial radius $r_{\rm vir}$, or if $M$ is smaller than the virial
mass, then $M = M_{\rm vir}$.

Table 1. Best-fit parameters $a$ and $b$ from Equation (14). The
last column shows the reduced $\chi^2$ value.

   a        b       $\chi^2_v$   $\chi^2_\xi$   $\chi^2$
  0.00    -0.07       7.80         23.68         31.48
  0.20    -0.02      15.57         44.36         59.93
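The radius parametrisation and the fallback rule for the mass can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the logarithm is taken to be base 10, and $r$ is assumed to be in $h^{-1}$ Mpc, an assumption that reproduces the $r_{\rm max}$ values of roughly 215 and 181 $h^{-1}$ kpc listed in Table 2 for $a = 0$, $b = -0.07$ at the lower edges of the two virial mass bins.

```python
LOG_M_NL = 13.38  # log10(M_nl / h^-1 Msun), the value quoted in the text

def sphere_radius(delta_t, log_m_vir, a, b):
    """Radius r = a*delta_t + b*log10(M_vir/M_nl).

    Assumption: the result is in h^-1 Mpc.
    """
    return a * delta_t + b * (log_m_vir - LOG_M_NL)

def peak_mass(r, r_vir, m_enclosed, m_vir):
    """Apply the fallback rule: keep the virial mass if the sphere is
    smaller than the virial radius or encloses less than the virial mass."""
    if r < r_vir or m_enclosed < m_vir:
        return m_vir
    return m_enclosed

# Radius in h^-1 kpc for a halo with log10(M_vir) = 10.3 and a = 0, b = -0.07
r_low_kpc = 1000.0 * sphere_radius(0.0, 10.3, a=0.0, b=-0.07)
```

With $a = 0$ the age term drops out, so the radius depends on the virial mass alone; with $a = 0.2$ older objects ($\delta_t > 0$) receive larger spheres at fixed mass.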

Once the new mass contained in this radius and its $\delta_t$
are measured, infall velocities and correlation functions are 
calculated for three bins in mass corresponding to the first, 
second, and third terciles of the mass distribution. This se- 
lection produces results for low, medium, and high masses, 
respectively. 
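The tercile split can be sketched with the Python standard library. The function name `tercile_bins` is hypothetical, and the choice of the default (exclusive) quantile method of `statistics.quantiles` is an assumption, since the text does not specify how the tercile boundaries are computed.

```python
import statistics

def tercile_bins(log_masses):
    """Label each object 0 (low), 1 (medium) or 2 (high) mass by tercile.

    The two cut points are the tercile boundaries of the input sample.
    """
    t1, t2 = statistics.quantiles(log_masses, n=3)
    return [0 if m <= t1 else 1 if m <= t2 else 2 for m in log_masses]
```

Each label then selects the galaxies whose infall velocities and correlation functions are measured together.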

The $\chi^2$ for the infall velocity field is calculated as

$$ \chi^2_v(r) = \frac{1}{3}\sum_{i=1}^{3}\left(\frac{1}{n_{\rm dof}}\sum_{r}\frac{[v_{\rm neg}(r) - v_{\rm pos}(r)]^2}{\sigma^2(r)}\right)_i . \tag{15} $$

The $\chi^2$ value for the $i$th mass bin is computed within the
range $2.5 < r/h^{-1}\,{\rm Mpc} < 5$, since this interval corresponds
to the two-halo regime; $v_{\rm neg}$ is the mean radial velocity
around galaxies with $\delta_t < -0.05$ (young objects) and $v_{\rm pos}$ is
this same quantity for $\delta_t > 0.05$ (old galaxies). The error is
estimated as $\sigma^2(r) = \sigma^2_{v_{\rm neg}} + \sigma^2_{v_{\rm pos}}$, with the first term being
the error for $v_{\rm neg}$ and the second term the error for $v_{\rm pos}$,
calculated as the error of the mean within the interval of
interest. The symbol $n_{\rm dof}$ denotes the number of degrees of
freedom. The value $\chi^2_v(r)$ is the average over the three mass
bins.
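The statistic described above (squared difference between the young and old mean radial velocities, divided by the sum of their variances, normalised by the degrees of freedom and averaged over the mass bins) can be sketched as follows. The function names are hypothetical, and the default of one degree of freedom per radial bin is an assumption, since the text does not state how $n_{\rm dof}$ is counted.

```python
def reduced_chi2(v_neg, v_pos, err_neg, err_pos, n_dof=None):
    """Reduced chi-square between young (neg) and old (pos) profiles
    for a single mass bin, using sigma^2 = err_neg^2 + err_pos^2."""
    if n_dof is None:
        n_dof = len(v_neg)  # assumption: one degree of freedom per radial bin
    total = 0.0
    for vn, vp, en, ep in zip(v_neg, v_pos, err_neg, err_pos):
        total += (vn - vp) ** 2 / (en ** 2 + ep ** 2)
    return total / n_dof

def chi2_v(mass_bins):
    """Average of the per-bin reduced chi-square values.

    Each entry of `mass_bins` is a tuple (v_neg, v_pos, err_neg, err_pos)
    restricted to the radial range of interest.
    """
    return sum(reduced_chi2(*b) for b in mass_bins) / len(mass_bins)
```

Only radial bins inside the fitting range should be passed in; the range selection is left to the caller.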

Similarly, the reduced $\chi^2$ for the correlation function
statistics is defined as

$$ \chi^2_\xi(r) = \frac{1}{3}\sum_{i=1}^{3}\left(\frac{1}{n_{\rm dof}}\sum_{r}\frac{[N_{\rm neg}(r) - N_{\rm pos}(r)]^2}{\sigma_N^2(r)}\right)_i , \tag{16} $$

and is calculated within the range $0.8 \le r/h^{-1}\,{\rm Mpc} < 10$,
mostly in the two-halo term; $N_{\rm neg}$ is the number of neighbours
for young galaxies, whereas $N_{\rm pos}$ is the same quantity
for old objects. The number of neighbours is defined as
$N(r) = \langle N_t(r)\rangle\,\xi(r) + \langle N_t(r)\rangle$, where $\langle N_t(r)\rangle =
N_{\rm pairs}(r)/N_{\rm centres}$ is the mean number of tracers [3]. The error
is $\sigma_N^2(r) = \sigma^2_{N_{\rm neg}} + \sigma^2_{N_{\rm pos}}$, where the first term is the error
for $N_{\rm neg}$ and the second for $N_{\rm pos}$, both calculated as the
relative error of the number of neighbours. We choose this
alternative to normalise the reduced $\chi^2$ in order to avoid
selecting parameters favoured by large uncertainties that can
induce spuriously good fits. The value $\chi^2_\xi(r)$ is the average
over the three mass bins.

[3] This relation comes from $\xi(r) = [N(r) - \langle N_t(r)\rangle]/\langle N_t(r)\rangle$. Therefore, if
the distribution of neighbours is random, $\xi(r)$ would be equal to
zero.

The best-fit values are shown in Table 1. They were obtained
by marginalising the reduced $\chi^2$ for both the infall
velocity and correlation function statistics. Such marginalisation
was done by integrating the likelihood

$$ f(\chi) = e^{-(\chi - \chi_{\rm min})^2/2}, \tag{17} $$

where $\chi_{\rm min}$ is the minimum reduced $\chi$ value for a specific
set of parameters. The final value is simply the sum of both
results (last column in Table 1). We find that $f(\chi)$ has two
maxima. The one corresponding to the best fit is that with
$a = 0$ and $b = -0.07$. The fit corresponding to the second
maximum in likelihood is that with $a = 0.2$ and $b = -0.02$.
It is worth mentioning that the reduced $\chi^2$ value allows us
to find the best parameters for a given sample; it is not
used to look for ideal parameters that
would result in $\chi^2 \approx 1$, since Eq. (14) is only intended as an
approximation to a more precise peak height proxy.
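One possible reading of the marginalisation step is sketched below. This is an assumption-laden illustration rather than the procedure actually used: it assumes the reduced $\chi$ values are tabulated on a regular grid of $(a, b)$, takes $\chi_{\rm min}$ as the minimum over that grid, and replaces the integral by a plain sum over grid points. The function names are hypothetical.

```python
import math

def likelihood(chi, chi_min):
    """Gaussian-like likelihood f(chi) = exp(-(chi - chi_min)^2 / 2)."""
    return math.exp(-((chi - chi_min) ** 2) / 2.0)

def marginalise(chi_grid):
    """Marginal likelihoods over a grid where chi_grid[i][j] is the
    reduced chi value at parameters (a_i, b_j).

    Returns (like_a, like_b): the likelihood summed over b for each a,
    and summed over a for each b.
    """
    chi_min = min(min(row) for row in chi_grid)
    like = [[likelihood(c, chi_min) for c in row] for row in chi_grid]
    like_a = [sum(row) for row in like]
    like_b = [sum(col) for col in zip(*like)]
    return like_a, like_b
```

The maxima of the two marginal curves would then indicate the preferred values of $a$ and $b$; a grid with unequal spacing would need the sums weighted by the cell widths.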

The infall velocity profiles and correlation functions for
the parameters $a = 0$ and $b = -0.07$ are shown in Figures
6 and 7, respectively. Notice that in this case the size
of the sphere in Equation (14) depends only on the halo
mass, specifically $r = -0.07\log(M_{\rm vir}/M_{\rm nl})$. The infall
velocity profiles for old and young galaxies are very similar
for each mass range. Furthermore, the correlation functions
for these populations are remarkably similar at scales $r > 1\,h^{-1}$
Mpc for each mass bin, indicating that the assembly
bias is not present when this redefinition of overdensity peak
height is used. For the case $a = 0.2$ and $b = -0.02$, which depends
on both the mass and age, we obtain similar correlation
functions and infall velocity profiles, although with slight
amplitude differences between populations of equal masses
but different ages (not shown in Figs. 6 and 7, to keep
the figures clear).

Notice that with this new definition of peak height the 
one-halo terms of old and young objects of equal mass are 
comparable, a property which the virial mass was not able 
to produce. 

Figure 8 shows the mass function for the parameters
$a = 0$ and $b = -0.07$ as filled black circles, for $a = 0.2$
and $b = -0.02$ as open triangles, and for the virial mass
as filled green squares. For comparison, the predicted mass
functions from the extended Press-Schechter model (EPS)
and from the Sheth, Mo, & Tormen (2001, SMT) model are
shown as long-dashed and dot-dashed lines, respectively. At
low masses, the $a = 0$, $b = -0.07$ distribution shows an
unphysical behaviour, with very few objects at $M \sim 10^{10}\,h^{-1}\,M_\odot$
and a bump at $\sim 10^{10.7}\,h^{-1}\,M_\odot$. This means that
most of the galaxies hosted by haloes in this range of virial
mass ($M_{\rm vir}$, green squares) changed their masses to $M \sim
10^{10.7}\,h^{-1}\,M_\odot$ after using these parameters. However, the
mass function changes only slightly with respect to that of
the virial mass when using the second-best fit values $a = 0.2$
and $b = -0.02$ (open black triangles), which also reduce
the assembly bias by introducing a dependence on the age.
Neither of the two parametrisations changes the mass function
at $M \gtrsim 10^{12}\,h^{-1}\,M_\odot$, and therefore in this range their mass
functions and the one resulting from the virial mass are all
consistent with the SMT prediction. We therefore consider
the second-best fit a better candidate since, by introducing
a smaller variation in the mass, we find no assembly bias
and a good agreement with SMT.
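For reference, a binned mass function with Poisson errors of the kind compared here can be sketched as below. The function name, the bin edges and the `volume` argument are hypothetical inputs; the simulation volume is not quoted in this section, so it is left as a parameter.

```python
import math

def mass_function(log_masses, edges, volume):
    """Return a list of (dn/dlogM, Poisson error) per logarithmic mass bin.

    `edges` are bin edges in log10(M) and `volume` is the comoving
    volume of the sample, e.g. in (h^-1 Mpc)^3.
    """
    result = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        n = sum(1 for m in log_masses if lo <= m < hi)
        norm = volume * (hi - lo)
        result.append((n / norm, math.sqrt(n) / norm))
    return result
```

Evaluating this once with the virial masses and once with each set of redefined masses gives the three distributions that are compared with the EPS and SMT predictions.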

Figure 9 shows the distribution of $r/r_{\rm vir}$ for two different
bins in virial mass for the two sets of best-fit parameters
in Table 1. Both panels show that most of the galaxies
keep their original halo masses, $M_{\rm vir}$, when using the
best-fit parameters $a = 0.2$ and $b = -0.02$ (solid lines). For the
lower-mass bin (left panel), these galaxies have a median
value of $r = r_{\rm vir}$. For the case $a = 0$ and $b = -0.07$ (dashed
lines), they have a median value of $r \sim 4\,r_{\rm vir}$. The maximum
and median radii (in units of $r_{\rm vir}$) and, also, the number of
objects which change their mass, decrease for higher virial
masses. Table 2 shows these radii in units of kpc. All the
galaxies with $M_{\rm vir} \gtrsim 6 \times 10^{12}\,h^{-1}\,M_\odot$ conserved their virial
masses, i.e. $M = M_{\rm vir}$, in both cases. This means that some
objects which were initially considered as those with low
peak heights, as given by their virial mass, are now associated
with regions with higher overdensities, particularly for
low virial masses.

Figure 6. Infall velocity profiles for the best-fit parameters $a = 0$ and $b = -0.07$ (see Table 1). Solid lines in red are for old objects and
dotted lines in blue are for young ones. Error bars were calculated using the jackknife method. Vertical lines mark the range in which
the reduced $\chi^2_v(r)$ is calculated. The lower-mass bin is on the left-hand panel, whereas the more massive bin is on the right-hand panel.
The mass $M$ shown in the figure key is in units of $h^{-1}\,M_\odot$. Old and young objects show very similar infall velocity profiles, irrespective
of the range in mass.

Figure 7. Correlation functions for the different mass bins indicated in each panel. Old (red circles) and young (blue squares) objects
are selected by using the radius parametrisation in Equation (14) given by the best-fit parameters $a = 0$ and $b = -0.07$ (see Table 1).
Error bars were calculated using the jackknife method. The lines repeated in each top box are obtained from the non-linear and linear
power spectra, $P(k)$ (see Section 3.2). Lower boxes are as in Fig. 4. The vertical lines mark the range in which the reduced $\chi^2_\xi(r)$ is
calculated. Note that the assembly bias is not present at large scales ($r > 1\,h^{-1}$ Mpc) in any of the mass bins presented. For smaller
scales, the differences in the clustering amplitude between old and young populations are typically below a factor of two.

Table 2. Maximum and median radii, $r_{\rm max}$ and $\langle r \rangle$, respectively, in physical units ($h^{-1}$ kpc) from Equation (14), as given by the
best-fit parameters in Table 1 for all objects and, also, split among the old and young populations of galaxies. The ranges in virial mass
are those shown in Fig. 9. Columns 3 and 4 correspond to $\log(M_{\rm vir}/h^{-1}\,M_\odot) = 10.3 - 10.7$; columns 5 and 6 correspond to
$\log(M_{\rm vir}/h^{-1}\,M_\odot) = 10.8 - 11.4$.

  best-fit params.       ages     r_max    <r>      r_max    <r>
  a = 0.0, b = -0.07     all      215.5    205.3    180.6    165.4
                         old      215.5    204.2    180.6    166.3
                         young    215.5    207.5    180.6    163.1
  a = 0.2, b = -0.02     all      415.9     55.9    445.2     86.6
                         old      415.9    101.8    445.2    123
                         young    383.5     51.1    407.8     76.6

Figure 8. Mass function obtained from using the virial mass
($M_{\rm vir}$, filled green squares) and the new masses from the
best-fit parameters $a = 0$, $b = -0.07$ (filled black circles) and $a =
0.2$, $b = -0.02$ (open black triangles). Error bars correspond to
the Poisson error. For comparison, we plot the mass functions
from the EPS (long-dashed line) and the SMT (dot-dashed line)
models. The best agreement with the SMT mass function is shown
by the results from virial masses and those from the second-best
set of parameters $a = 0.2$ and $b = -0.02$.



5 PROPERTIES OF THE NEW PEAKS 

We have presented a new proxy for the peak height that can
account for the assembly bias at large scales (Section 4). In
some cases this model considers the mass enclosed by radii
greater than the virial radius, inside which one could be
including other haloes. In order to see differences between the
old and young populations and how these could affect the
statistics for $\xi(r)$ and $v_{\rm inf}(r)$, Figure 10 shows the number
of haloes inside each new peak height (when $r > r_{\rm vir}$),
excluding the central galaxy, as a function of the ratio between
their virial mass and the virial mass of the central galaxy,
$M_{\rm vir\_c}$, for the mass ranges and parameters shown in Figure
9. As can be seen from both panels, there is a trend
where the number of haloes contained in spheres of size $r$
around young galaxies (blue) is lower than that for spheres
around old objects (red), the effect being stronger for higher
virial masses (right panel). Therefore, the peak for an old
galaxy, after taking into account the parametrisation of the
radius from Equation (14), adds more haloes and mass than
for a young object. Furthermore, we can see that the higher
the virial mass, the lower the influence of other haloes in
defining the new peak height for galaxies.

Another interesting result is that, for the parameters
$a = 0.2$ and $b = -0.02$, both old (solid red lines) and
young (dotted blue lines) populations tend to add massive
peaks, as can be seen from the left panel of Figure 10,
which shows that the maximum ratio where the distribution
is non-zero is $\log(M_{\rm vir}/M_{\rm vir\_c}) \sim 4$, but the minimum
is $\log(M_{\rm vir}/M_{\rm vir\_c}) \sim -0.6$ and $0.4$ for the old and young
galaxies, respectively. For the case $a = 0$ and $b = -0.07$,
young objects (long-dashed blue curves) show a peak around
$M_{\rm vir} = 0.6 \times M_{\rm vir\_c}$, but they include a broad range of more
massive haloes. Old objects (dashed red curves) are characterised
by this same behaviour and additionally show
a peak at $\log(M_{\rm vir}/M_{\rm vir\_c}) \sim 1.7$. These results indicate
that old, low-mass objects are surrounded preferentially by
high-mass haloes. The latter is consistent with recent results
which show that old, low-mass galaxies suffer truncation
of matter by nearby massive haloes (Wang et al. 2007;
Dalal et al. 2008; Hahn et al. 2009). However, our results
also indicate that there is a population of low-mass objects
which are surrounded by smaller masses. In particular, for
the $a = 0.2$ and $b = -0.02$ case, this is only seen for old
objects, regardless of their $M_{\rm vir\_c}$. It is possible that galaxies
with low and high $M_{\rm vir}/M_{\rm vir\_c}$ ratios correspond to different
aspects of the assembly bias phenomenology. This,
along with studies of the prevalence of this bias by varying
the concentration, number of satellites, triaxiality, spin, and
other halo parameters, is the focus of a forthcoming paper
(Lacerna et al. in preparation).



6 CONCLUSIONS 

We have presented a new approach to estimate the overdensity
peak height with the aim of understanding the assembly
bias effect. This is a relevant issue that could affect the ability
of the next generation of galaxy surveys to infer accurate
cosmological parameters. Our method consists of redefining
the overdensity that characterises each galaxy using the
information of its virial mass and the relative age, $\delta_t$; this
new definition is proposed as a better alternative than the
virial mass. Wang et al. (2007) pointed out that old, low-mass
haloes at $z = 0$ are associated with higher overdensities in
the initial conditions, compared to what would be expected
from their final virial masses. Instead of searching for the
overdensity at high redshifts, we try to obtain a measure
of the present-day peak height, which in turn can be tested
using large, $z = 0$ surveys. In order to do this, we measure
the assembly bias amplitude using two estimators, the
two-point correlation function and the infall velocity profile.
We find that when using the mass inside spheres of radius $r$
from Equation (14) with the parameters in Table 1, galaxies
do not show significant differences in the two-halo regime
for objects of a given mass range but different age. Furthermore,
the dependence on the age is reduced in the one-halo
term as well; the biggest difference is a factor of two for
the lowest mass bin at a separation of $r \sim 150\,h^{-1}$ kpc,
whereas virial masses yield a difference of
two orders of magnitude in the clustering amplitude at the
same scale.

Figure 10. Number of haloes inside radius $r$ (Eq. 14) given by the best-fit parameters in Table 1: $a = 0$, $b = -0.07$ (dashed and
long-dashed curves around old and young galaxies, respectively) and $a = 0.2$, $b = -0.02$ (solid and dotted curves around old and young
galaxies, respectively). The results are plotted as a function of the virial mass normalised by the virial mass of the central galaxy, $M_{\rm vir\_c}$.
The mass range is shown in each panel, where $M_{\rm vir\_c}$ is in units of $h^{-1}\,M_\odot$.

The best-fit parameters $a = 0$ and $b = -0.07$ imply that
the relative age is not strictly necessary (see Equation 14)
to find a peak height that includes the mass that has not
collapsed completely onto haloes yet, and at the same time
traces the assembly bias. We found that the best parameters
are those that yield median sphere radii in the range of 1 to
4 $r_{\rm vir}$. Clearly, environmental effects out to these distances
are playing the main role in shaping the two-halo term, as
shown in Fig. 9. It is worth pointing out that only low-mass
objects, with $M_{\rm vir} \lesssim 6 \times 10^{12}\,h^{-1}\,M_\odot$, are subject
to a change in their peak heights, which coincides with the
mass limit for assembly bias found by several authors (e.g.
Gao et al. 2005). This is also the case for our second-best
fitting parameters, $a = 0.2$ and $b = -0.02$, which introduce a
dependence of the peak height on the age, and help trace the
assembly bias while at the same time producing final masses
that are in excellent agreement with the SMT mass function.
Therefore, this option is the preferred one to obtain a proxy
for the peak height which is not subject to the assembly bias
effect.

Neighbouring massive haloes that are typically at distances
out to 4 $r_{\rm vir}$ (see Figs. 9 and 10) are probably responsible
for these effects. These could disrupt the normal growth
of small objects and, therefore, affect their ages. However, 
we also find a population of haloes which, with the new defi- 
nition, includes nearby low-mass haloes, particularly for old 
objects. 

To summarise, we stress the apparent fact that particu- 
larly for low-mass objects, the virial mass is not an adequate 
proxy for peak height in the standard EPS picture, because 
equal virial mass objects can actually belong to initial den- 
sity peaks of very different amplitude, as evidenced in the 
large differences shown in the 2-halo regime by statistics 
such as the correlation function and infall velocities. It is nec- 
essary to include a more global environmental component, 
i.e. the mass of the region that effectively characterises the 
peak height. When the latter is taken into account, we ob- 
tain the general prescription where the bias responds to the 
height of the mass peak alone at large scales. Further work is 
required in order to confirm that this proposed parametri- 
sation of the peak height is enough to account for other 
variations of clustering of equal mass haloes with different 
properties such as concentration, spin, etc. The next papers 
in this series will study this, along with an application of 
this method to large surveys. 